Today we will learn about plotly, an R package that
allows you to build interactive graphs. To shake things up a bit, we’ll
be working with a dataset that contains the information of all
801 Pokémon in the world!
This dataset, from the Rokemon package, is based on the hit
video game franchise. Every Pokémon species has its own characteristics
when it comes to height, weight, attack power, defense power, speed, hit
points (stamina), among others. Gotta analyze’em
all!
# Please install the Rokemon package.
# Make sure you have the "devtools" package; if not, install it first:
# install.packages("devtools")
devtools::install_github("schochastics/Rokemon")
# Load the packages
library(Rokemon)
library(plotly)
library(tidyverse)  # also attaches ggplot2 and dplyr
# Dataset
pokemon <- pokemon
# Gotta analyze’em all!

ggplot2 provides meaningful, appealing, but static graphs. This limits the user’s ability to analyse the data.
Plotly allows users to interact with graphs in a wide variety of ways: zooming in and out of the plot, hovering over a point, filtering categories, among others.
Interactive graphs are useful for gaining a deeper understanding of the patterns in the data.
Let’s take a closer look at our dataset.
pok3a <- ggplot(pokemon, aes(x = attack, y = speed)) +
  geom_point(shape = 1, alpha = 0.5) +
  ggtitle("Fig. 3A: Attack vs Speed", subtitle = "Built with ggplot2") +
  # axis labels follow the aesthetics: attack is on x, speed is on y
  labs(x = "Attack",
       y = "Speed",
       caption = "Source: Pokemon") +
  theme_minimal()
pok3a

This graphic looks cool! We can clearly see a positive correlation between attack and speed. But what would happen if we added the Pokémon’s names to the graph? Imagine that we would like to know which Pokémon each dot belongs to, in order to build the fastest and strongest Pokémon team. It would look like this:
pok3b <- ggplot(pokemon, aes(x = attack, y = speed, label = name)) +
  geom_point(shape = 1, alpha = 0.5) +
  geom_text(size = 3, hjust = 1, vjust = 1) +
  ggtitle("Fig. 3B: Attack vs Speed", subtitle = "Built with ggplot2") +
  # axis labels follow the aesthetics: attack is on x, speed is on y
  labs(x = "Attack",
       y = "Speed",
       caption = "Source: Pokemon") +
  theme_minimal()
pok3b

Graphic 3B is not clear: the labels overlap so much that you cannot identify the Pokémon. So… here comes plotly.
pok3c <- pokemon %>%
  plot_ly(x = ~attack, y = ~speed, type = 'scatter', mode = 'markers', text = ~name,
          marker = list(color = "blue", size = 10)) %>%
  # attack is mapped to x and speed to y, so the axis titles follow suit
  layout(title = "Fig. 3C: Attack vs Speed",
         xaxis = list(title = list(text = 'Attack')),
         yaxis = list(title = list(text = 'Speed')))
pok3c

Plotly allows the user to hover over the individual observations and see hover text corresponding to the point on the plot. Now, you can see each Pokémon’s name without clutter.
Much better, right? Let’s keep pushing!
pok3d_df <- pokemon
# treat generation as a category so that each one gets its own legend entry
pok3d_df$generation <- as.character(pok3d_df$generation)
pok3d <- pok3d_df %>%
  plot_ly(x = ~attack, y = ~speed, color = ~generation,
          type = 'scatter', mode = 'markers', text = ~name,
          colors = c("#D64E12", "#F9A52C", "#EFDF48", "#8BD346", "#60DBE8", "#16A4D8", "#9B5FE0")) %>%
  layout(title = "Fig. 3D: Attack vs Speed",
         xaxis = list(title = list(text = 'Attack')),
         yaxis = list(title = list(text = 'Speed')))
pok3d

This graph allows us to map the Pokémon by the game they first
appeared in (generation).
Plotly lets you filter observations by clicking on the legend. Try singling out only Pokémon from the first generation by double-clicking on “1” on the right-hand side.
Besides scatterplots, you can also create other types of graphics in Plotly, such as bubble charts, histograms and box plots. Let’s see some cool examples!
Remember, you can be anything you want to be.
# In this example, we want to see whether there is any correlation between capture rate and average total stats across Pokémon types, using a bubble chart.
pokemon_count <- pokemon %>%
  count(type1)
pokemon_sub <- pokemon %>%
  group_by(type1) %>%
  summarise(ave_capture = mean(capture_rate, na.rm = TRUE),
            ave_power = mean(base_total, na.rm = TRUE))
pokemon_sub <- merge(pokemon_sub, pokemon_count, by = "type1")
type_colors <- c('#A8A77A', '#EE8130', '#6390F0', '#F7D02C', '#7AC74C','#96D9D6', '#C22E28', '#A33EA1', '#E2BF65', '#A98FF3', '#F95587', '#A6B91A', '#B6A136', '#735797', '#6F35FC', '#705746', '#B7B7CE', '#D685AD')
pok4a <- plot_ly(pokemon_sub, x = ~ave_power, y = ~ave_capture,
                 # each bubble is a type, so the hover label says "Type"
                 hovertext = ~paste('</br> Type: ', type1,
                                    '</br> Number of Pkmn: ', n),
                 type = 'scatter', mode = 'markers',
                 size = ~n, color = ~type1, colors = type_colors,
                 marker = list(opacity = 0.5, sizemode = 'diameter'))
pok4a <- pok4a %>% layout(title = 'Capture rate vs Base stats total per Pokémon Type',
                          xaxis = list(title = list(text = 'Average base stats total')),
                          yaxis = list(title = list(text = 'Average capture rate')),
                          showlegend = FALSE)
pok4a

In this bubble chart, you can hover over the circles to see which Pokémon type each one corresponds to and how many Pokémon belong to that type.
(For Pokémon nerds out there, this plot only takes into account primary typing! So sorry, Flying types.)
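As a side note, the count and the two averages behind the bubble chart can also be computed in a single grouped summary, which avoids the separate `count()` and `merge()` steps. The sketch below uses a small made-up data frame (`toy`, with illustrative values only) rather than the real dataset, and assumes only that dplyr is installed.

```r
library(dplyr)

# Made-up stand-in for the pokemon data (values are illustrative only)
toy <- data.frame(
  type1        = c("fire", "fire", "water", "water", "water"),
  capture_rate = c(45, 255, 45, 190, NA),
  base_total   = c(309, 405, 314, 320, 530)
)

# One grouped summary returns the row count and both averages per type
toy_sub <- toy %>%
  group_by(type1) %>%
  summarise(
    n           = n(),                               # rows per type, NAs included
    ave_capture = mean(capture_rate, na.rm = TRUE),  # NAs dropped before averaging
    ave_power   = mean(base_total, na.rm = TRUE)
  )

toy_sub
```

The resulting table has the same columns (`type1`, `n`, `ave_capture`, `ave_power`) that the bubble chart code expects.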
pok4b <- pokemon %>%
plot_ly(alpha = 0.6) %>%
add_histogram(x = ~defense,
name = "Defense") %>%
add_histogram(x = ~attack,
name = "Attack") %>%
layout(barmode = "overlay",
title = "Histogram",
xaxis = list(title = "Point Average",
zeroline = FALSE),
yaxis = list(title = "Frequency",
zeroline = FALSE))
pok4b

Wow! In this histogram, Pokémon defense and attack points both look roughly bell-shaped!
pok4c <- plot_ly(data = pokemon,
y = ~base_total,
x = ~generation,
type = "box",
showlegend = FALSE)
pok4c

This boxplot shows how the overall strength (base_total) of the Pokémon is distributed within each generation of games.
Hovering over each box with your mouse will reveal the summary statistics of
each group.
This tutorial level will walk you through how to build your first plotly plot (think of it as catching your first Pokémon). There are two ways to
do this:
- the plot_ly() function
- the ggplotly() function, which translates a ggplot into plotly

We’re going to have a more in-depth look at the first option.
Let’s say that I want to have a look at the Pokémon from the first Pokémon game, released in 1996. My theory is that a Pokémon’s hit points total is positively correlated with its defense stat, as a more protected, defensive Pokémon will last longer in battle.
I can map this out easily with plotly via a scatterplot. To choose
what plot I want, I use the type argument within
plot_ly(). For a full list of attributes and arguments that
can be passed along in plotly, see schema().
Notice that I refer to the variables with a ~. The tilde turns the column name into a formula,
which tells plotly to look the variable up in the data frame.
# cleaning data
pokemon_gen13 <- filter(pokemon, generation == "1")
# constructing our plotly graph (mode = 'markers' asks for points explicitly)
pok5a <- plot_ly(pokemon_gen13, x = ~defense, y = ~hp, type = 'scatter', mode = 'markers')
pok5a

Visually, there does seem to be a positive correlation between these two statistics.
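A short aside on the `~` used in the call above: the sketch below contrasts the formula form with passing vectors directly. It uses a tiny made-up data frame (`toy`) instead of the real dataset, so the values are purely illustrative.

```r
library(plotly)

# Made-up data frame standing in for the pokemon data
toy <- data.frame(defense = c(49, 63, 83), hp = c(45, 60, 80))

# With a formula (~), plotly looks the column up inside `toy`
p_formula <- plot_ly(toy, x = ~defense, y = ~hp,
                     type = 'scatter', mode = 'markers')

# Without a data frame, the vectors have to be passed directly
p_vectors <- plot_ly(x = toy$defense, y = toy$hp,
                     type = 'scatter', mode = 'markers')

p_formula
```

Both calls should describe the same points. The formula version is usually more convenient, because the same data frame can feed `x`, `y`, `color` and `hovertext` at once.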
While it’s a nice scatterplot, this hardly takes advantage of plotly’s interactive capabilities. Let’s make our plot better by introducing detail and plotly’s signature feature, hover text.
# cleaning data
pokemon_gen13 <- filter(pokemon, generation == "1")
pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "1"] <- "Legendary"
pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "0"] <- "Not legendary"
# constructing our plotly graph
pok5b <- plot_ly(pokemon_gen13, x = ~defense, y = ~hp,
type = 'scatter',
color = ~is_legendary,
colors = c('#BF382A', '#0C4B8E'),
opacity = 0.5,
hovertext = ~paste('</br> Name: ', name,
'</br> Species: ', classfication,
'</br> Type: ', type1, '/', type2)) %>%
layout(title = "Hit points by Defense points",
xaxis = list(title = list(text = 'Defense statistic')),
yaxis = list(title = list(text = 'Hit points')))
pok5b

I used the following arguments and functions to add more detail:
- color allows you to create a legend for your plot. Here, I separated the observations based on whether the Pokémon is legendary (read: one of a kind) or not.
- colors assigns colors to your legend. Supports hex codes, RGB values…
- opacity modifies observation markers, useful if you have many overlapping ones.
- hovertext is the big deal here. This option is very flexible, so please see plotly’s documentation for the full scope of what it’s capable of. Here, I made it display basic information about each critter.
- layout() enables me to adjust the axis labels, but also supports many other arguments that modify your plot’s final appearance. See plotly’s layout reference for a full list.

But plotly doesn’t stop there. As previously shown, the package is capable of rendering many different plots, even 3D ones, taking full advantage of the digital space these plots are intended to be displayed in.
I’ll show you this by adding a z-axis on my previous plot. While I can define the trace type manually, plotly will also automatically assume a 3D scatterplot from my parameters.
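For reference, spelling the trace type out by hand would look roughly like the sketch below. It uses a small made-up data frame (`toy`) rather than the real dataset. Note that in 3D plots the axis titles are set under `scene` in `layout()`, not under the top-level `xaxis` and `yaxis`.

```r
library(plotly)

# Made-up data frame standing in for the pokemon data
toy <- data.frame(
  defense  = c(49, 63, 83, 90),
  hp       = c(45, 60, 80, 106),
  height_m = c(0.7, 1.0, 2.0, 2.0)
)

# Explicit 3D scatter trace; axis titles go inside `scene`
p3d <- plot_ly(toy, x = ~defense, y = ~hp, z = ~height_m,
               type = 'scatter3d', mode = 'markers') %>%
  layout(scene = list(
    xaxis = list(title = 'Defense'),
    yaxis = list(title = 'Hit points'),
    zaxis = list(title = 'Height (m)')
  ))

p3d
```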
# cleaning data
pokemon_gen13 <- filter(pokemon, generation == "1")
pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "1"] <- "Legendary"
pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "0"] <- "Not legendary"
# constructing our plotly graph
pok5c <- plot_ly(pokemon_gen13, x = ~defense, y = ~hp, z = ~height_m,
                 color = ~is_legendary,
                 colors = c('#BF382A', '#0C4B8E'),
                 opacity = 0.5,
                 hovertext = ~paste('</br> Name: ', name,
                                    '</br> Species: ', classfication,
                                    '</br> Type: ', type1, '/', type2)) %>%
  # in a 3D plot, the axis titles are set inside `scene`
  layout(title = "Hit points by Defense points by Height (in meters)",
         scene = list(
           xaxis = list(title = list(text = 'Defense statistic')),
           yaxis = list(title = list(text = 'Hit points')),
           zaxis = list(title = list(text = 'Height (meters)'))))
pok5c

While plotly is best used on a digital platform due to its interactive features, there may be a time when you want to use your beautiful, carefully drawn plot in a non-dynamic medium, like, say, a paper or a printout.
The orca() function allows you to do this, although it
requires you to download
Orca, an open source command line tool that interacts with plotly.
You can transform your graphs into .jpegs, .pngs, .pdfs…
The installation process is a bit more complicated than that of your usual R package (see Orca’s installation instructions for details), so we won’t be showing orca() in action. However, this is what the code to render your plot into an image would look like:
# orca() is provided by the plotly package; there is no separate orca package to load
library(plotly)
orca(pok5b, "pok5b.png")

Yes, it’s possible to turn an existing ggplot into a plotly graph! The function ggplotly() allows you
to do this.
Remember this graph from before?
pok3a
Applying our make-over with
ggplotly():
# ggplotly() must come first: style() works on plotly objects, not on ggplots
ggplotly(pok3a) %>%
  style(hovertext = pokemon$name)

To change the looks of this graph, you would have to do some
modifications on the ggplot side of things, although some alterations
are possible with style(). It may be better to build the
graph natively with plot_ly() depending on your
requirements.
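Another common way to control the hover text of a converted ggplot is to map the label to a `text` aesthetic and pass `tooltip = "text"` to `ggplotly()`. The sketch below uses a three-row made-up data frame (`toy`) rather than the full dataset.

```r
library(ggplot2)
library(plotly)

# Made-up data frame standing in for the pokemon data
toy <- data.frame(
  name   = c("Bulbasaur", "Charmander", "Squirtle"),
  attack = c(49, 52, 48),
  speed  = c(45, 65, 43)
)

# `text` is not drawn by ggplot2 itself; ggplotly() reads it for the tooltip
g <- ggplot(toy, aes(x = attack, y = speed, text = name)) +
  geom_point()

# tooltip = "text" restricts the hover box to that aesthetic
p <- ggplotly(g, tooltip = "text")
p
```

This keeps all the customization on the ggplot side, which can be handy when the static and the interactive versions of a figure need to stay in sync.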
The plotly package is a powerful data visualization tool
which can bring your plots and graphs to life by making them more
dynamic and interactive. Like ggplot2, it supports a wide variety of
plots, from scatterplots to histograms and even heatmaps. plotly
is best used in a digital setting, such as in your browser, in order
to take full advantage of its functionality.
Plotting is made possible with the plot_ly() function.
The simple call
plot_ly(dataset, x = ~xaxis, y = ~yaxis, type = 'scatter')
will get you started with basic functionality; swap 'scatter' for
the trace type you need. The
hovertext argument allows you to modify the information that
is shown when hovering over a point. Many other customizations are
possible (type ?plot_ly to see more), and the graph
visualization can be adjusted via layout().
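Put together, the pieces recapped above might look like the following sketch. It runs on a three-row made-up data frame (`toy`), so the names and values are only there for illustration.

```r
library(plotly)

# Made-up data frame standing in for the pokemon data
toy <- data.frame(
  name   = c("Bulbasaur", "Charmander", "Squirtle"),
  type1  = c("grass", "fire", "water"),
  attack = c(49, 52, 48),
  speed  = c(45, 65, 43)
)

p <- plot_ly(toy, x = ~attack, y = ~speed,
             type = 'scatter', mode = 'markers',
             # <br> starts a new line inside the hover box
             hovertext = ~paste0('Name: ', name, '<br>Type: ', type1),
             # show only the custom hover text, not the raw x/y values
             hoverinfo = 'text') %>%
  layout(title = 'Attack vs Speed',
         xaxis = list(title = 'Attack'),
         yaxis = list(title = 'Speed'))

p
```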
With these tips, your plots will be the very best, like no plot ever was!
Here are three exercises to challenge your plotly skills.
First: using the pokemon dataset from the package Rokemon, can
you plot out the relationship between height and weight for all Fire-type
Pokémon? Assign legend colors according to generation, and make sure the
hover text includes each Pokémon’s name and classification.
Possible solution below - no cheating! ;)
pokemon_hw <- filter(pokemon, type1 == "fire")
hw_plot <- plot_ly(pokemon_hw, x = ~height_m, y = ~weight_kg,
                   type = 'scatter', mode = 'markers',
                   # factor() gives one legend entry per generation instead of a color bar
                   color = ~factor(generation),
                   opacity = 0.5,
                   hovertext = ~paste('</br> Name: ', name,
                                      '</br> Species: ', classfication)) %>%
  layout(title = "Relationship between height and weight of Fire-types",
         xaxis = list(title = list(text = 'Height (meters)')),
         yaxis = list(title = list(text = 'Weight (kilograms)')))
hw_plot

We wanted to see if there was a correlation between weight and attack in Ground- and Water-type Pokémon. Sadly… we couldn’t find any :/ The worst part is that Bug Pokémon did their thing and ruined our code! Now we can’t show you that there is no correlation. But here you come, Pokémon masteR: please find the BUGS in the following code, fix them and bring back the graph! Do you accept the challenge?
challenge2 <- rokemon %>%
filter(type = c("ground","water")) %>%
plot_ly(x = ~attack, y = ~weigth_kg, splt = ~type,
type == 'scatter') %>%
layout(title = 'Weight vs. Attack',
xaxs = list(title = 'Attack'),
yaxis = list(title = 'Weight (kilograms)'),
legend = list(title = list(text = 'Type')),
plot_bgcolor = 'white')
challenge2

Empirically speaking, we think that the number of Water-type Pokémon has decreased across generations. Can you plot a graphic to check whether there is such a pattern? Create a bar plot filtering Pokémon by type; include the Water type, of course, and three more. Add colors according to the type. Share your results!
challenge3 <- pokemon %>%
  # %in% keeps every row whose type is one of the four; == would recycle the vector
  filter(type1 %in% c("fire", "ground", "water", "grass")) %>%
  count(generation, type1) %>%
  plot_ly(x = ~generation, y = ~n, type = 'bar', color = ~type1)
challenge3
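A word of caution when filtering on several types at once: comparing a column to a vector with `==` recycles the vector position by position, whereas `%in%` tests membership. The base R sketch below shows the difference on a made-up vector of types.

```r
# Made-up vector of primary types
type1  <- c("fire", "water", "grass", "ground", "water", "fire", "bug", "grass")
wanted <- c("fire", "ground", "water", "grass")

# `==` recycles `wanted` along `type1`, comparing position by position
eq_match <- type1 == wanted

# `%in%` asks, for each element, whether it appears anywhere in `wanted`
in_match <- type1 %in% wanted

sum(eq_match)  # 2: only the rows that happen to line up
sum(in_match)  # 7: every row except "bug"
```

With `==`, most matching rows are silently dropped, which would distort the counts in a bar plot.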